Query Subtopic Mining via Subtractive Initialization of Non-negative Sparse Latent Semantic Analysis
Authors
Abstract
Ambiguous and multifaceted queries are common in academic and commercial search engines, so identifying the popular subtopics of a query is an important problem for search engines. In this paper, we propose a novel method to discover the popular subtopics of a given query. Our method first constructs a search behavior tripartite graph from search log data. Then, we use a subtractively initialized Non-negative Sparse LSA model to mine subtopics from the tripartite graph. Experimental results on two real-world Chinese search logs (i.e., the CADAL search log and the Sogou search log) show that our proposed method significantly outperforms the comparison methods in terms of MAP, α-nDCG and S-recall. We also applied our method to the CADAL digital library to provide a novel faceted book search service that can reduce users’ effort in finding books.
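The abstract names S-recall as one of its evaluation measures without defining it. As a minimal sketch, assuming the standard definition (the fraction of a query's known subtopics covered by the top-k results), it could be computed as below. The function name, argument names, and toy data are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, List, Set


def s_recall(
    ranked_items: List[str],
    item_subtopics: Dict[str, Set[str]],
    all_subtopics: Set[str],
    k: int,
) -> float:
    """Fraction of the known subtopics covered by the top-k ranked items."""
    if not all_subtopics:
        return 0.0
    covered: Set[str] = set()
    for item in ranked_items[:k]:
        # Only count subtopics that belong to the query's known subtopic set.
        covered |= item_subtopics.get(item, set()) & all_subtopics
    return len(covered) / len(all_subtopics)


# Hypothetical toy data: four ranked items labelled with subtopics a-d.
ranking = ["d1", "d2", "d3", "d4"]
labels = {"d1": {"a"}, "d2": {"a"}, "d3": {"b", "c"}, "d4": {"d"}}
subtopics = {"a", "b", "c", "d"}

score_at_2 = s_recall(ranking, labels, subtopics, 2)  # only "a" is covered
score_at_3 = s_recall(ranking, labels, subtopics, 3)  # "a", "b", "c" are covered
```

Under this definition a ranking that repeats the same subtopic gains nothing, which is why the measure is commonly used alongside MAP and α-nDCG when diversity matters.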
Similar resources
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users alike. Constructing an adequate query that best conveys the user's information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
Query Subtopic Mining by Combining Multiple Semantics
Query subtopic mining aims to find aspects to represent people’s potential intents for a query. Clustering query reformulations is the most common approach for subtopic mining these days. However, there are some challenges that the existing approaches have to face in finding both relevant and diverse subtopics, such as term mismatch and data sparseness. In this paper, a novel semantic represent...
Sparse Latent Semantic Analysis
Latent semantic analysis (LSA), one of the most popular unsupervised dimension reduction tools, has a wide range of applications in text mining and information retrieval. The key idea of LSA is to learn a projection matrix that maps the high-dimensional vector space representations of documents to a lower-dimensional latent space, i.e., the so-called latent topic space. In this paper, we propose ...
Qualifier Mining for NTCIR-INTENT
We participated in the Subtopic Mining subtask of the NTCIR-9 INTENT task. The query log is used as the primary resource to mine latent subtopics. Through analysis of the query log, we observed that queries describing similar information needs use a similar group of qualifiers, which may also frequently occur together within queries. We introduced the concept of a qualifier graph for subtopic mining. To so...
Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification
Understanding users’ search intents by mining query subtopics is a challenging task and a prerequisite step for search result diversification. This paper proposes mining query subtopics by exploiting word embeddings and a short-text similarity measure. We extract candidate subtopics from multiple sources and introduce a new way of ranking based on a new novelty estimation that faithfully repres...
Journal title:
- J. Inf. Sci. Eng.
Volume 32, Issue
Pages -
Publication date 2016